Viewpoint Paper: Repurposing the Clinical Record: Can an Existing Natural Language Processing System De-identify Clinical Notes?

Authors

  • Frances P. Morrison
  • Li Li
  • Albert M. Lai
  • George Hripcsak
Abstract

Electronic clinical documentation can be useful for activities such as public health surveillance, quality improvement, and research, but existing methods of de-identification may not provide sufficient protection of patient data. The general-purpose natural language processor MedLEE retains medical concepts while excluding the remaining text, so in addition to processing text into structured data, it may be able to provide a secondary benefit of de-identification. Without modifying the system, the authors tested the ability of MedLEE to remove protected health information (PHI) by comparing 100 outpatient clinical notes with the corresponding XML-tagged output. Of 809 instances of PHI, 26 (3.2%) were detected in the output as a result of processing and identification errors. However, PHI in the output was highly transformed, with much of it appearing as normalized terms for medical concepts, potentially making re-identification more difficult. The MedLEE processor may be a good enhancement to other de-identification systems, both removing PHI and providing coded data from clinical text.
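
The comparison described in the abstract (checking whether PHI from a note still appears in the XML-tagged output, then reporting the share that leaked) can be illustrated with a minimal Python sketch. This is not the authors' procedure, and the XML layout, tag names, and PHI terms below are hypothetical and do not reflect MedLEE's actual output schema. The functions `surviving_phi` and `leak_rate` are illustrative names, and the simple substring match is an assumption that would miss PHI that has been heavily transformed.

```python
import xml.etree.ElementTree as ET


def surviving_phi(phi_terms, xml_output):
    """Return the PHI terms that still appear in the XML output (case-insensitive)."""
    root = ET.fromstring(xml_output)
    pieces = []
    for elem in root.iter():
        # Gather element text, trailing text, and attribute values.
        if elem.text:
            pieces.append(elem.text)
        if elem.tail:
            pieces.append(elem.tail)
        pieces.extend(elem.attrib.values())
    haystack = " ".join(pieces).lower()
    return [term for term in phi_terms if term.lower() in haystack]


def leak_rate(leaked, total):
    """Percentage of PHI instances that were found in the output."""
    return 100.0 * leaked / total


# Hypothetical output for one note; the schema is invented for illustration.
xml_output = '<note><problem v="cough"/><finding v="parkinson disease"/></note>'
# Hypothetical PHI annotated in the source note; a surname can coincide
# with a medical term and so survive as a normalized concept.
phi_terms = ["Jane Doe", "Parkinson", "555-0100"]

print(surviving_phi(phi_terms, xml_output))  # ['Parkinson']
print(round(leak_rate(26, 809), 1))          # 3.2, the figure reported in the abstract
```

The second call only reproduces the arithmetic behind the reported figure: 26 of 809 PHI instances is about 3.2%.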

Similar resources

Unlocking echocardiogram measurements for heart disease research through natural language processing

BACKGROUND In order to investigate the mechanisms of cardiovascular disease in HIV infected and uninfected patients, an analysis of echocardiogram reports is required for a large longitudinal multi-center study. IMPLEMENTATION A natural language processing system using a dictionary lookup, rules, and patterns was developed to extract heart function measurements that are typically recorded in ...

Towards the Creation of a Large Corpus of Synthetically-Identified Clinical Notes

Clinical notes often describe the most important aspects of a patient’s physiology and are therefore critical to medical research. However, these notes are typically inaccessible to researchers without prior removal of sensitive protected health information (PHI), a natural language processing (NLP) task referred to as deidentification. Tools to automatically de-identify clinical notes are need...

Automatic Mapping Clinical Notes to Medical Terminologies

Automatic mapping of key concepts from clinical notes to a terminology is an important task to achieve for extraction of the clinical information locked in clinical notes and patient reports. The present paper describes a system that automatically maps free text into a medical reference terminology. The algorithm utilises Natural Language Processing (NLP) techniques to enhance a lexical token m...

Correlating Lab Test Results in Clinical Notes with Structured Lab Data: A Case Study in HbA1c and Glucose

It is widely acknowledged that information extraction of unstructured clinical notes using natural language processing (NLP) and text mining is essential for secondary use of clinical data for clinical research and practice. Lab test results are currently structured in most of the electronic health record (EHR) systems. However, for referral patients or lab tests that can be done in non-clinica...

Electronic medical records for clinical research: application to the identification of heart failure.

OBJECTIVE To identify patients with heart failure (HF) by using language contained in the electronic medical record (EMR). METHODS We validated 2 methods of identifying HF through the EMR, which offers transcription of clinical notes within 24 hours or less of the encounter. The first method was natural language processing (NLP) of the EMR text. The second method was predictive modeling based...

Journal:
  • Journal of the American Medical Informatics Association : JAMIA

Volume 16, Issue 1

Pages: -

Publication date: 2009